fix(plugin): replace misleading Cache:% status-bar metric with raw cache token display #1359
Conversation
…che token display

The Cache:XX% segment derived from context_window.current_usage only reflects the most recent API call, not session-wide cache efficiency. Users frequently misread it as cumulative cache hit rate.

Replace compute_cache_hit_rate() with format_cache_segment() that renders raw token values (e.g. ♻2k/3.5k) with the following semantics:
- numerator = cache_read_input_tokens
- denominator = input_tokens + cache_creation_input_tokens + cache_read_input_tokens
- values represent the latest API call, not session totals

Also add format_compact_tokens() helper for k-suffix compact rendering (532 → 532, 1000 → 1k, 1500 → 1.5k, 128000 → 128k).

Safe fallback: when current_usage is missing/null/zero, the cache segment is omitted entirely so the status line still renders without a broken slot.

Test coverage (#1354):
- format_cache_segment: 7 cases covering empty, null, input-only, partial, full, large-value k-format, and no-percent regression
- format_status_line integration: 3 cases locking in the new output contract and guarding against Cache:% regression

Closes #1355
Closes #1354
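The semantics in the commit message can be illustrated with a minimal sketch. This is not the code from codingbuddy-hud.py: the CACHE_GLYPH constant name and the empty-string return for an omitted segment are assumptions made for illustration, and the fallback to "0" on invalid input follows the description given in the review of this PR.

```python
CACHE_GLYPH = "\u267b"  # ♻ (constant name is hypothetical)


def format_compact_tokens(n):
    """Render a token count compactly, e.g. 532 -> '532', 1500 -> '1.5k'."""
    try:
        n = int(n)
    except (TypeError, ValueError):
        return "0"  # fallback for non-integer input; never raises
    if n < 1000:
        return str(n)
    k = n / 1000
    if k == int(k):
        return f"{int(k)}k"  # whole thousands: 1000 -> '1k'
    return f"{k:.1f}k"  # one decimal place: 1500 -> '1.5k'


def format_cache_segment(ctx_window):
    """Return '♻{cache_read}/{total}' for the latest API call, or '' to omit it."""
    usage = (ctx_window or {}).get("current_usage") or {}
    cache_read = usage.get("cache_read_input_tokens") or 0
    total = (
        (usage.get("input_tokens") or 0)
        + (usage.get("cache_creation_input_tokens") or 0)
        + cache_read
    )
    if total <= 0:
        return ""  # assumption: an empty string signals "omit the slot"
    return f"{CACHE_GLYPH}{format_compact_tokens(cache_read)}/{format_compact_tokens(total)}"


example = {
    "current_usage": {
        "input_tokens": 500,
        "cache_creation_input_tokens": 1000,
        "cache_read_input_tokens": 2000,
    }
}
print(format_cache_segment(example))  # ♻2k/3.5k
```

Both numbers describe only the most recent call, so the sketch deliberately computes no ratio: a caller that wants a session-wide figure would need to accumulate usage itself.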
Conductor Self-Review (Self-Approve Unavailable)
Since GitHub blocks approving one's own PR, posting this as a review comment. CI is green and acceptance criteria are satisfied; ready to merge.
CI Status
All 20 checks green: lint, format, typecheck, test, circular, build, security, validate-commands, rules-validation.
Strengths
Devil's Advocate
Minor follow-up suggestions (non-blocking)
These are nice-to-haves and can be addressed later if desired:
None of these block the merge. Ready to go in.
JeremyDev87
left a comment
EVAL Mode Review — PR #1359
CI Status
PASS: 29/29 jobs green (lint, format, typecheck, test:coverage, circular, build, and security audit all pass)
Local re-verification:
- yarn workspace codingbuddy-claude-plugin lint ✓
- yarn workspace codingbuddy-claude-plugin format:check ✓
- yarn workspace codingbuddy-claude-plugin typecheck ✓
- python3 -m pytest packages/claude-code-plugin/tests/test_hud.py -v → 105/105 pass
Severity Summary
- Critical: 0
- High: 0
- Medium: 2
- Low: 3
Findings
Critical (0)
None.
High (0)
None.
Medium (2)
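M1. Missing dedicated unit tests for format_compact_tokens
- format_compact_tokens is currently verified only indirectly through format_cache_segment (test_hud.py has no TestFormatCompactTokens class)
- Missing boundary cases: 999 (the < 1000 path), 1000 (exact boundary), 1001 (trim-logic edge), None/invalid type (the try/except path)
- This function is a public helper that can be reused outside the status bar, so pinning its contract with standalone tests is recommended
- Suggestion: add a TestFormatCompactTokens class (6-8 cases); this can go in a follow-up PR or ride along with #1357
M2. Minor formatting inconsistency in format_compact_tokens
- 1000 → "1k" (integer path, trim applied)
- 1001 → "1.0k" (k = 1.001, k != int(k) → f"{1.001:.1f}k" → "1.0k")
- The docstring says "trimmed of trailing .0", but values in the 1001-1049 range display as 1.0k, which looks untrimmed (it is actually the result of the .1f format)
- Not a functional defect, but the documentation and the behavior diverge slightly; recommend either pinning the actual behavior with a test or clarifying the docstring to "near-thousand values may display as 1.0k"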
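The rounding edge in M2 can be reproduced in isolation. The snippet below is a sketch reconstructed from the branch logic quoted in the finding (k != int(k) followed by a .1f format); the function name render_k is hypothetical and the real helper may differ in detail.

```python
def render_k(n: int) -> str:
    """Format a count >= 1000 using the k-suffix branch quoted in M2."""
    k = n / 1000
    if k == int(k):
        return f"{int(k)}k"  # exact thousands lose the ".0"
    return f"{k:.1f}k"  # everything else keeps one decimal, even "1.0"


for n in (1000, 1001, 1049, 1500):
    print(n, "->", render_k(n))
# 1000 -> 1k
# 1001 -> 1.0k
# 1049 -> 1.0k
# 1500 -> 1.5k
```

The trim only applies when the division is exact, so any value that rounds down to a whole number of thousands still prints a trailing ".0". A standalone test covering these inputs would pin whichever behavior the docstring ends up describing.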
Low (3)
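L1. format_compact_tokens error fallback is undocumented
- The try/except (TypeError, ValueError) path that returns '0' is not mentioned in the docstring
- Recommend adding one line: "Returns '0' when input is not a valid integer."
L2. Unicode glyph used inline
- \u267b (♻) is inserted directly inside format_cache_segment
- Extracting it to a constant at the top of the file (CACHE_RECYCLE_GLYPH = "\u267b") would make the intent clearer and create a hook for future theming/customization
- Pure nit; the current code works without issue
L3. 7 examples remain in status-bar-model.md
- packages/claude-code-plugin/docs/status-bar-model.md still contains 7 Cache:XX% examples (lines 23, 34, 166, 172, 179, 186, 193)
- The PR body explicitly delegates this to #1357 Wave 2 (to avoid file overlap)
- Not a blocker for this PR; noted for tracking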
Spec Compliance
#1355 acceptance criteria: ✓ all met
- Cache:XX% removal confirmed (test_status_line_no_longer_contains_cache_percent)
- Contract implemented: cache_read_input_tokens as numerator, input + cache_create + cache_read as denominator
- Compact format for the status bar (♻2k/3.5k)
- current_usage missing/null → segment omitted entirely (stated explicitly in the tests)
- Stable formatting for small and large values (532, 1k, 1.5k, 128k)
#1354 acceptance criteria: ✓ all met (7/7 cases)
- ✓ test_no_context_window
- ✓ test_null_current_usage
- ✓ test_input_tokens_only_no_cache_read
- ✓ test_partial_cache_read
- ✓ test_full_cache_read_shows_raw_not_100pct
- ✓ test_large_values_use_k_format
- ✓ test_status_line_contains_raw_cache_tokens + test_regression_no_percent_in_output
#1356 acceptance criteria: ✓ code scope met (docs scope delegated to #1357)
- Cache:XX% removed from the status bar ✓
- raw token semantics ✓
- Regression tests prevent % from returning ✓ (test_regression_no_percent_in_output, test_status_line_no_longer_contains_cache_percent)
- Docstring clarifies last-call semantics ✓ (9-line docstring on format_cache_segment)
- ⚠️ status-bar-model.md example updates delegated to the #1357 follow-up
Additional Verification
Backward compatibility: ✓ safe
- Global grep for compute_cache_hit_rate: 0 references in code
- The only remaining reference is docs/plans/2026-03-28-wave2-statusline-mode-detect.md (a past planning document; no change needed)
Security: no relevant risk (read-only display helper; does not directly handle external input)
Performance: no relevant risk (called once per status-bar update, O(1))
Recommendation
APPROVE (with follow-up suggestions)
Reasoning:
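- 0 Critical + 0 High → nothing blocks the merge
- Acceptance criteria for #1355/#1354 are fully met by code and tests
- Removing compute_cache_hit_rate is safe (no external callers)
- The change is well bundled as a TDD RED-GREEN pair, and the PR body states the need for an atomic merge
- All CI passes and all local re-verification passes (105/105 tests, lint/format/typecheck clean)
- All Medium/Low items are quality-improvement suggestions that can be handled as follow-ups outside this PR's scope
Because the reviewer is also the author, --approve is not available, so approval is signaled with --comment instead. Recommend proceeding with the merge.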
Reviewed by code-reviewer (EVAL mode) via codingbuddy parse_mode
Review Cycle Complete: APPROVED ✅
Review panel summary (from EVAL mode reviewer, see previous review comment):
Loop termination condition met: Critical = 0 AND High = 0. CI is green (29/29), local verification passes (105/105 python tests, lint/format/typecheck clean), and all acceptance criteria for #1355 and #1354 are satisfied. The Medium/Low findings from the reviewer are tracked as follow-up suggestions and do not block merge. Ready for user to merge. Closing the review panel.
… display

Document why the status-bar cache segment renders raw tokens (♻N/M) instead of a percentage, explaining the last-call semantics of context_window.current_usage from Claude Code stdin.

status-bar-model.md:
- Update the example status line and segment table to show the new ♻2k/3.5k format instead of the deprecated Cache:XX%
- Add a new "Cache Segment Semantics (Last-Call Only)" subsection explaining numerator/denominator, format rules, fallback behavior, and a contributor caution against reintroducing percentage rendering
- Refresh all 5 Mode Examples (PLAN/ACT/EVAL/AUTO/Ready) to match the current v5.3.0 output
- Add an explicit note on the Ready state that the cache segment is hidden when no API calls have been made yet (documented fallback, not a bug)

codingbuddy-hud.py:
- Expand format_compact_tokens docstring with explicit output rules, the 1000-1049 rounding note, and the error fallback contract (returns "0" when input is not coercible to int; never raises)
- Expand format_cache_segment docstring with the last-call rationale, fallback conditions, and a contributor caution mirrored in the docs

This PR is docs-only; no behavior changes. Regression tests from PR #1359 continue to guard against Cache:XX% reintroduction.

Closes #1357
Summary
The status-bar Cache:XX% segment derived from context_window.current_usage only reflects the most recent API call, not session-wide cache efficiency. Users frequently misread it as a cumulative cache hit rate (e.g. seeing Cache:100% and assuming the whole session is fully cached).
This PR replaces the misleading percentage with a raw token display (♻2k/3.5k) and adds regression coverage to prevent reverting to %-based rendering.
Changes
Implementation (#1355)
- Remove compute_cache_hit_rate(): the % calculation was mathematically correct but semantically misleading
- Add format_cache_segment(ctx_window): renders ♻{cache_read}/{total} with last-call semantics
  - Numerator: cache_read_input_tokens
  - Denominator: input_tokens + cache_creation_input_tokens + cache_read_input_tokens
- Add format_compact_tokens(n) helper: 532 → 532, 1000 → 1k, 1500 → 1.5k, 128000 → 128k
- Update format_status_line(): omit the cache slot entirely when usage data is missing, so the status line still renders cleanly
Regression tests (#1354)
- TestFormatCacheSegment (7 tests): empty ctx, null usage, input-only, partial read, full read, large k values, explicit % regression guard
- TestFormatStatusLineCacheSegment (3 tests): final status-line output locks in the new contract and explicitly asserts Cache: never appears
Output comparison
Before:
◕‿◕ CB v5.3.0 | PLAN 🟢 | 12m | ~$0.42 | Cache:53% | Ctx:45% | Opus
After:
◕‿◕ CB v5.3.0 | PLAN 🟢 | 12m | ~$0.42 | ♻800/1.5k | Ctx:45% | Opus
Test plan
- python3 -m pytest tests/test_hud.py: 105/105 pass (was 99, net +6 after replacing 4 percentage tests with 10 raw-token tests)
- python3 -m pytest tests/: 748 pass (full plugin test suite)
- python3 -m pytest hooks/tests/: 239 pass
- yarn workspace codingbuddy-claude-plugin lint: clean
- yarn workspace codingbuddy-claude-plugin format:check: clean
- yarn workspace codingbuddy-claude-plugin typecheck: clean
- yarn workspace codingbuddy-claude-plugin test:coverage: 123/123 pass, 100% stmts
- yarn workspace codingbuddy-claude-plugin circular: no cycles
- yarn workspace codingbuddy-claude-plugin build: success
TDD pair note
#1354 (test) and #1355 (fix) are a TDD RED-GREEN pair. They touch different files but are semantically inseparable — merging either in isolation breaks CI. This PR lands them together as a single atomic change.
Follow-up
- Update status-bar-model.md and add docstring clarification in a follow-up PR (Wave 2) to avoid file overlap with this PR.
Closes #1355
Closes #1354